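```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(readr)
# Read the Kaggle training data; train.csv is expected in the working directory.
titanic.train <- read_csv("train.csv")
# View() opens a data viewer, so call it only in an interactive session.
if (interactive()) View(titanic.train)
```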
The variables used for prediction are numeric and categorical, as explained below.
Survived: Survival status, 0 = No and 1 = Yes
Sex: Gender, male or female
Pclass: Passenger class, a proxy for socio-economic status (SES): 1st = Upper, 2nd = Middle, 3rd = Lower
Age: Age in years. Age is fractional if less than 1. If the age is estimated, it is in the form xx.5
SibSp: Number of siblings or spouses aboard. The dataset defines family relations in this way:
Sibling: brother, sister, stepbrother, stepsister
Spouse: husband, wife (mistresses and fiancés were ignored)
Parch: Number of parents or children aboard:
Parent: mother, father
Child: daughter, son, stepdaughter, stepson. Some children travelled only with a nanny, therefore Parch = 0 for them.
Embarked: Port of embarkation, C = Cherbourg, Q = Queenstown, S = Southampton
Cabin: Cabin number
The summary shows that some numeric variables are continuous and have a distribution, such as Age and Fare, while some categorical variables are stored as characters, such as Embarked, Sex and Cabin. Other numeric variables are discrete, such as Pclass, Survived, Parch and SibSp. Finally, some variables only carry information that is not used in this analysis: PassengerId, Ticket and Fare.
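The variable types described above can be made explicit by recoding the discrete and character columns as factors. The following is a minimal sketch, assuming the Kaggle column names (`Survived`, `Pclass`, `Sex`, `Embarked`). The small `toy` data frame is made up for illustration and only stands in for `titanic.train`.

```r
# Hypothetical rows standing in for titanic.train (all values are invented).
toy <- data.frame(
  Survived = c(0, 1, 1, 0),
  Pclass   = c(3, 1, 3, 2),
  Sex      = c("male", "female", "female", "male"),
  Age      = c(22, 38, NA, 0.42),
  Embarked = c("S", "C", "S", "Q"),
  stringsAsFactors = FALSE
)

# Recode the discrete and character columns as labelled factors.
toy$Survived <- factor(toy$Survived, levels = c(0, 1), labels = c("No", "Yes"))
toy$Pclass   <- factor(toy$Pclass, levels = c(1, 2, 3),
                       labels = c("1st", "2nd", "3rd"))
toy$Sex      <- factor(toy$Sex, levels = c("male", "female"))
toy$Embarked <- factor(toy$Embarked, levels = c("C", "Q", "S"),
                       labels = c("Cherbourg", "Queenstown", "Southampton"))

# str() shows the resulting types; summary() reports counts for the factors
# and quartiles (plus the number of NAs) for the numeric Age column.
str(toy)
summary(toy)
```

With factors in place, `summary()` counts each category instead of reporting only "character", which makes the split between continuous and categorical variables easier to see.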
Table 1.1 shows a modified version of the original Titanic dataset that includes a subset of variables: Age, Sex, Passenger Class, Port of Embarkation, Survival status, SibSp (Siblings/Spouses), and Parch (Parents/Children). The table is divided into two sections based on the "Survived" variable, one for passengers who survived (= 1) and one for passengers who did not survive (= 0), and reports counts and percentages.
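Counts and percentages within each survival group can be reproduced with base R. This is only a sketch under the assumption that percentages are taken within each "Survived" column; the original table may have been built with a different tool. The `toy` rows are invented for illustration.

```r
# Hypothetical rows (invented values) using the Kaggle column names.
toy <- data.frame(
  Survived = c(0, 0, 0, 1, 1, 0, 1, 0),
  Sex      = c("male", "male", "female", "female",
               "female", "male", "male", "male")
)

# Cross-tabulate Sex (rows) against Survived (columns).
counts <- table(Sex = toy$Sex, Survived = toy$Survived)

# margin = 2 gives column percentages: the share of each sex within the
# "did not survive" (0) and "survived" (1) groups.
pct <- round(100 * prop.table(counts, margin = 2), 1)

counts
pct
```

Swapping `Sex` for `Pclass` or `Embarked` gives the corresponding rows of the table.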
Table 1.2 shows that passengers in 1st class have the highest mean age (38.23 ± 14.80 years) and passengers in 3rd class have the lowest (25.14 ± 12.49 years). The minimum age, about 0.42 years, is in 3rd class, and the maximum age, about 80 years, is in 1st class.
Table 1.3 shows that males are older on average (30.72 ± 14.80 years) than females (27.91 ± 14.11 years). Both the minimum and the maximum age, about 0.42 and 80 years, belong to male passengers.
Table 1.4 shows that passengers who embarked at Cherbourg have the highest mean age (30.72 ± 14.80 years) and those who embarked at Queenstown have the lowest (28.08 ± 16.91 years). The youngest passenger, about 0.42 years old, embarked at Cherbourg, and the oldest, about 80 years old, embarked at Southampton. The port of embarkation is not available (NA) for 2 passengers.
Table 1.5 shows that passengers who did not survive are older on average (30.62 ± 14.17 years) than those who survived (28.34 ± 14.95 years).
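The age summaries in Tables 1.2 to 1.5 all follow the same pattern: mean, standard deviation, minimum and maximum of Age within each level of a grouping variable. A minimal sketch of that pattern is shown here. The helper `age_summary()` and the `toy` rows are hypothetical, and missing ages are assumed to be dropped before summarising.

```r
# Hypothetical rows (invented values) using the Kaggle column names.
toy <- data.frame(
  Pclass = c(1, 1, 1, 3, 3, 3),
  Age    = c(30, 40, 50, 20, NA, 30)
)

# Summarise Age within each level of a grouping column.
# Missing ages are removed before computing the statistics.
age_summary <- function(data, group) {
  stats <- function(x) {
    x <- x[!is.na(x)]
    c(n = length(x), mean = mean(x), sd = sd(x), min = min(x), max = max(x))
  }
  # split() makes one Age vector per group; sapply() returns one column per
  # group, so t() turns the result into one row per group.
  t(sapply(split(data$Age, data[[group]]), stats))
}

by_class <- age_summary(toy, "Pclass")
round(by_class, 2)
```

Calling the same helper with `"Sex"`, `"Embarked"` or `"Survived"` as the `group` argument would give the layout of the other three tables.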
Graph 2.1 shows passengers by gender and class as percentages. Males in 3rd class are the largest group, about 38.94% of passengers, and females in 2nd class are the smallest, about 8.52%.
Graph 2.2 shows passengers by gender and port of embarkation as percentages. Males who embarked at Southampton are the largest group, about 49.60% of passengers.
Graph 2.3 shows survival by gender as percentages. More females than males survived, about 26.15% of passengers, while most males died, about 52.52% of passengers.
Graph 2.4 shows survival by passenger class as percentages. More 1st class passengers survived than 2nd or 3rd class passengers, about 15.26% of passengers, and 3rd class passengers account for most deaths, about 41.75%.
Graph 2.5 shows survival by port of embarkation as percentages. More passengers who embarked at Southampton survived than those from Cherbourg or Queenstown, about 24.40% of passengers, although most Southampton passengers died, about 48.03%.
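The percentages quoted for Graphs 2.1 to 2.5 appear to be shares of all passengers for each combination of two categorical variables. A sketch of that calculation, assuming percentages of the grand total and using invented `toy` rows, could look like this:

```r
# Hypothetical rows (invented values) using the Kaggle column names.
toy <- data.frame(
  Sex    = c("male", "male", "male", "female", "female"),
  Pclass = c(3, 3, 1, 2, 1)
)

# With no margin argument, prop.table() divides every cell by the grand
# total, so the percentages over all Sex x Pclass cells sum to 100.
joint_pct <- 100 * prop.table(table(Sex = toy$Sex, Pclass = toy$Pclass))
round(joint_pct, 2)

# Convert to long format (columns Sex, Pclass, Freq) and pick the largest
# cell, i.e. the most common combination.
cells <- as.data.frame(joint_pct)
top <- cells[which.max(cells$Freq), ]
top
```

The same table can be passed to `barplot()` or a plotting package to draw the grouped bars.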
Graph 2.6 shows the age ranges of passengers by gender. Males aged 21-30 are the largest group, about 20.86% of passengers, and the largest female group accounts for about 11.34%.
Graph 2.7 shows the age ranges of passengers by class. Passengers in 3rd class whose age is not available are the largest group, about 18.06% of passengers.
Graph 2.8 shows the age ranges of passengers by port of embarkation. Passengers aged 21-30 who embarked at Southampton are the largest group, about 25.56% of passengers.
Graph 2.9 shows the age ranges of passengers by survival. Passengers aged 21-30 are the largest group both among survivors, about 11.76% of passengers, and among those who died, about 20.44%.
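Age ranges such as 21-30, together with a separate group for missing ages, can be built with `cut()`. The bin edges and labels here are assumptions, since the report does not state how the ranges were defined, and the `toy` rows are invented.

```r
# Hypothetical rows (invented values) using the Kaggle column names.
toy <- data.frame(
  Age      = c(0.42, 22, 30, 30.5, 45, 80, NA),
  Survived = c(1, 0, 1, 0, 0, 1, 0)
)

# Assumed bin edges: (0,10], (10,20], ..., (70,80]. Each interval is closed
# on the right, so an age of exactly 30 falls in "21-30" and 30.5 in "31-40".
breaks <- seq(0, 80, by = 10)
labels <- c("0-10", "11-20", "21-30", "31-40",
            "41-50", "51-60", "61-70", "71-80")

age_range <- as.character(cut(toy$Age, breaks = breaks, labels = labels))

# Keep passengers with a missing age as their own group.
age_range[is.na(age_range)] <- "Not applicable"
toy$AgeRange <- factor(age_range, levels = c(labels, "Not applicable"))

# Percentage of all passengers in each age range by survival status.
round(100 * prop.table(table(AgeRange = toy$AgeRange,
                             Survived = toy$Survived)), 2)
```

Keeping "Not applicable" as a factor level means passengers without a recorded age still count toward the total, which matches a graph where that group appears as its own bar.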
Graph 2.10 shows family relations (siblings and spouses) by survival. Passengers with no siblings or spouses aboard are the largest group among survivors, about 23.56% of passengers, but most of them died, about 44.66%.
Graph 2.11 shows family relations (parents and children) by survival. Passengers with no parents or children aboard are the largest group among survivors, about 26.15% of passengers, but most of them died, about 49.94%.
Graph 3.1 shows the distribution of Age by survival status.
Graph 3.2 shows the distribution of Age by passenger class and survival status.
Graph 3.3 shows the distribution of Age by port of embarkation and survival status.
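One way to compare age distributions between groups is a five-number summary per group, drawn as a box plot. This is a sketch only: the report does not say which plot type Graphs 3.1 to 3.3 use, and the `toy` rows are invented.

```r
# Hypothetical rows (invented values) using the Kaggle column names.
toy <- data.frame(
  Age      = c(2, 22, 30, 35, 54, 19, 27, 40, 58, NA),
  Survived = c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0)
)
toy$Survived <- factor(toy$Survived, levels = c(0, 1), labels = c("No", "Yes"))

# Five-number summary of Age per survival group, ignoring missing ages.
age_dist <- sapply(split(toy$Age, toy$Survived), fivenum, na.rm = TRUE)
rownames(age_dist) <- c("min", "lower hinge", "median", "upper hinge", "max")
age_dist

# A box plot draws the same summary for each group.
boxplot(Age ~ Survived, data = toy, xlab = "Survived", ylab = "Age (years)")
```

Extending the formula to `Age ~ Survived + Pclass` or `Age ~ Survived + Embarked` would split the boxes by a second variable, provided those columns are present in the data.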